Teaching Linear Models, Machine Learning, and Deep Learning

The project consisted of teaching data science at SupAgro, with a focus on linear models, machine learning, and deep learning. The mission ran from January 2025 to August 2025, in collaboration with the MISTEA UMR (Joint Research Unit). The work was supervised by the Information Systems Department (DSI) of INRAE (National Research Institute for Agriculture, Food and Environment).

Tasks & Objectives

As a teaching assistant, I taught statistics, linear regression, and spatial statistics to master's students in agronomy. One of the main objectives was to create educational materials and to facilitate the transition from R to Python for data science applications, particularly in the context of near-infrared (NIR) spectroscopy.

Success criteria included not only the development of comprehensive educational materials but also the successful transition of students from R to Python. A key objective was to enhance the laboratory's capabilities in Python and large language models (LLMs). Finally, it was essential to develop a methodical approach to model evaluation and to understand the limitations of deep learning compared to traditional machine learning algorithms.
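As an illustration of what a methodical approach to model evaluation can look like, here is a minimal sketch of k-fold cross-validation in which every model is scored on exactly the same folds and compared against a trivial baseline. It is not taken from the course materials: the function names (k_fold_rmse, fit_mean, fit_ols, and their predict counterparts) and the synthetic data are hypothetical, and only NumPy is assumed.

```python
import numpy as np

def k_fold_rmse(fit, predict, X, y, k=5, seed=0):
    """Return the per-fold RMSE of a model given as fit/predict callables."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(y))   # same seed -> same folds for every model
    folds = np.array_split(indices, k)
    scores = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        params = fit(X[train_idx], y[train_idx])
        residuals = y[test_idx] - predict(params, X[test_idx])
        scores.append(np.sqrt(np.mean(residuals ** 2)))
    return np.array(scores)

# Baseline model: always predict the mean of the training targets.
def fit_mean(X, y):
    return y.mean()

def predict_mean(mean, X):
    return np.full(X.shape[0], mean)

# Ordinary least squares with an intercept column.
def fit_ols(X, y):
    design = np.column_stack([np.ones(X.shape[0]), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict_ols(coef, X):
    return coef[0] + X @ coef[1:]

# Synthetic data, used here only for illustration.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
y = 1.0 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.3, size=200)

rmse_mean = k_fold_rmse(fit_mean, predict_mean, X, y)
rmse_ols = k_fold_rmse(fit_ols, predict_ols, X, y)
print(f"mean baseline: {rmse_mean.mean():.3f} +/- {rmse_mean.std():.3f}")
print(f"linear model:  {rmse_ols.mean():.3f} +/- {rmse_ols.std():.3f}")
```

Because the models are passed in as plain callables, a more flexible model (for example a small neural network) could in principle be scored with the same function and the same folds, which keeps the comparison with simpler algorithms fair.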

Actions and Development

My first step was to create Jupyter notebooks for teaching linear models, machine learning, and deep learning. I then developed methodologies for transitioning from R to Python, focusing on practical applications in NIR spectroscopy. Regular interactions with students through office hours and project reviews facilitated their learning.
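To give an idea of the kind of R-to-Python mapping involved, the sketch below reproduces a simple R linear regression workflow with NumPy, with the corresponding R idiom shown in a comment above each step. It is an illustrative example only, not an excerpt from the actual notebooks: the data are synthetic and the variable names are assumptions.

```python
import numpy as np

# Synthetic data standing in for a real calibration set (illustration only).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=50)
y = 3.0 + 2.0 * x + rng.normal(scale=0.5, size=50)

# R: fit <- lm(y ~ x); coef(fit)
# The column of ones plays the role of R's implicit intercept.
design = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
intercept, slope = coef

# R: predict(fit, newdata = data.frame(x = c(2, 4)))
x_new = np.array([2.0, 4.0])
y_new = intercept + slope * x_new

# R: summary(fit)$r.squared
residuals = y - design @ coef
r_squared = 1.0 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)

print(f"intercept={intercept:.3f}, slope={slope:.3f}, R^2={r_squared:.3f}")
```

The main conceptual difference for students coming from R is that the formula interface builds the design matrix implicitly, whereas here it is constructed by hand.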

Regular exchanges with the project, scientific, and IT teams, as well as with the former development team, facilitated my work. Collaboration with the MISTEA UMR was crucial for developing a common understanding of the educational materials and establishing a shared vocabulary. Given the complexity of the subject matter and the significant variation in student backgrounds, implementing a comprehensive curriculum was a major challenge, but also a learning opportunity.

Key decisions were made collectively during weekly teaching team meetings. For the educational materials, I presented a Proof of Concept (POC) before implementing the complete solution.

Results

The project produced several results: reusable educational resources, positive interactions within the scientific community, and personal growth in data science and deep learning. The educational materials gave students a better understanding of linear models, machine learning, and deep learning, while the transition from R to Python gave them valuable practical skills. Additionally, the methodical approach to model evaluation improved students' understanding of the capabilities and limitations of different algorithms.

I learned to communicate complex concepts clearly, to design engaging and challenging projects, and to provide constructive feedback. Teaching data science also strengthened my own understanding of the subject.

Technical Stack

The technologies used were Python, Jupyter notebooks, NumPy, scikit-learn, and PyTorch, with Markdown for documentation. For the educational materials, I chose a combination of Python and Jupyter notebooks, while the other technological choices were made to align with the course objectives. The course, demanding in both its theoretical and practical components, required teaching skills as well as technical ones. Gaps in students' existing knowledge also posed a challenge, which I addressed by providing additional resources and support throughout the course. Finally, learning to use Jupyter notebooks effectively for teaching was an important step in improving the learning experience.